Overview

Dataset Statistics

Number of Variables 10
Number of Rows 6.3512e+06
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 1.5 GB
Average Row Size in Memory 255.4 B
Variable Types
  • Numerical: 6
  • Categorical: 4

Dataset Insights

amount is skewed
oldbalanceOrig is skewed
newbalanceOrig is skewed
oldbalanceDest is skewed
newbalanceDest is skewed
nameOrig has a high cardinality: 6341907 distinct values
nameDest has a high cardinality: 2716810 distinct values
isFraud has constant length 1
oldbalanceOrig has 2101614 (33.09%) zeros
newbalanceOrig has 3603682 (56.74%) zeros
oldbalanceDest has 2698585 (42.49%) zeros
newbalanceDest has 2434670 (38.33%) zeros
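The zero percentages in the insights above follow from the zero counts divided by the row count. A minimal sketch, assuming a row count of 6351193 (inferred from the per-column character counts: isFraud values all have length 1 and total 6351193 decimal digits); the names `N_ROWS`, `zero_counts`, and `zero_share` are illustrative and not part of the report:

```python
# Assumption: exact row count inferred from the isFraud character counts.
N_ROWS = 6_351_193

# Zero counts copied from the Dataset Insights list.
zero_counts = {
    "oldbalanceOrig": 2_101_614,
    "newbalanceOrig": 3_603_682,
    "oldbalanceDest": 2_698_585,
    "newbalanceDest": 2_434_670,
}


def zero_share(count, n_rows=N_ROWS):
    """Return the percentage of rows equal to zero, rounded to 2 places."""
    return round(100 * count / n_rows, 2)


zero_pct = {name: zero_share(count) for name, count in zero_counts.items()}

for name, pct in zero_pct.items():
    print(f"{name}: {pct:.2f}% zeros")
```

Under that assumption the computed shares match the reported 33.09%, 56.74%, 42.49%, and 38.33%.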

Variables


step

numerical

Approximate Distinct Count 699
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 242.5553
Minimum 1
Maximum 699
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • step is skewed right (γ1 = 0.3382)

Quantile Statistics

Minimum 1
5-th Percentile 16
Q1 155
Median 238
Q3 333
95-th Percentile 471
Maximum 699
Range 698
IQR 178
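The IQR above is simply Q3 minus Q1. The sketch below reproduces it and also derives outlier fences using the conventional 1.5 × IQR (Tukey) rule; whether the report's own outlier count uses exactly this rule is an assumption, and `is_outlier` is a hypothetical helper:

```python
# Quartiles for step, copied from the Quantile Statistics block.
q1, q3 = 155, 333

iqr = q3 - q1  # interquartile range

# Assumption: conventional Tukey fences at 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(step):
    """Flag a step value that falls outside the Tukey fences."""
    return step < lower_fence or step > upper_fence


print(f"IQR = {iqr}, fences = [{lower_fence}, {upper_fence}]")
```

Because the minimum of step is 1, only the upper fence can be crossed under this rule, so any flagged values would lie in the upper tail.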

Descriptive Statistics

Mean 242.5553
Standard Deviation 141.0676
Variance 19900.078
Sum 1.5405e+09
Skewness 0.3382
Kurtosis 0.246
Coefficient of Variation 0.5816
  • step has 92146 outliers
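The descriptive statistics for step are tied together by a few identities: variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the sum is the mean times the row count. A minimal sketch that checks them, assuming a row count of 6351193 (inferred from the per-column character counts, not stated in this block):

```python
# Values copied from the Descriptive Statistics block for step.
mean = 242.5553
std = 141.0676

# Assumption: exact row count inferred from the character counts.
N_ROWS = 6_351_193

variance = std ** 2      # close to the reported 19900.078
cv = std / mean          # coefficient of variation
total = mean * N_ROWS    # close to the reported sum 1.5405e+09

print(f"variance ~ {variance:.3f}")
print(f"cv ~ {cv:.4f}")
print(f"sum ~ {total:.4e}")
```

Small differences in the last digits are expected, since the report rounds the mean and standard deviation before printing them.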

type

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 459968462 B

Length

Mean 7.4224
Standard Deviation 0.532
Median 7
Minimum 5
Maximum 8

Sample

1st row PAYMENT
2nd row PAYMENT
3rd row TRANSFER
4th row CASH_OUT
5th row PAYMENT

Letter

Count 43510683
Lowercase Letter 0
Space Separator 0
Uppercase Letter 43510683
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (CASH_OUT, PAYMENT) take over 50.0%

amount

numerical

Approximate Distinct Count 5308896
Approximate Unique (%) 83.6%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 179815.536
Minimum 0
Maximum 9.2446e+07
Zeros 12
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • amount is skewed right (γ1 = 31.0509)

Quantile Statistics

Minimum 0
5-th Percentile 2379.2308
Q1 13711.875
Median 76802.4921
Q3 211948.39
95-th Percentile 549581.2888
Maximum 9.2446e+07
Range 9.2446e+07
IQR 198236.515

Descriptive Statistics

Mean 179815.536
Standard Deviation 603630.9774
Variance 3.6437e+11
Sum 1.142e+12
Skewness 31.0509
Kurtosis 1803.4093
Coefficient of Variation 3.3569
  • amount is not normally distributed (p-value 4.233647847022867e-25)
  • amount has 328243 outliers

nameOrig

categorical

Approximate Distinct Count 6341907
Approximate Unique (%) 99.9%
Missing 0
Missing (%) 0.0%
Memory Size 479402683 B

Length

Mean 10.4823
Standard Deviation 0.6041
Median 11
Minimum 5
Maximum 11

Sample

1st row C1231006815
2nd row C1666544295
3rd row C1305486145
4th row C840083671
5th row C2048537720

Letter

Count 6351193
Lowercase Letter 0
Space Separator 0
Uppercase Letter 6351193
Dash Punctuation 0
Decimal Number 60223945
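The character counts above can be cross-checked against the mean length. A minimal sketch, assuming every nameOrig value carries exactly one leading uppercase letter (as in the sampled rows), so that the uppercase count equals the number of rows:

```python
# Character counts copied from the Letter block for nameOrig.
uppercase_letters = 6_351_193
decimal_digits = 60_223_945

# Assumption: one uppercase prefix per value, so rows == uppercase count.
n_rows = uppercase_letters

# Mean string length = total characters / number of values.
mean_length = (uppercase_letters + decimal_digits) / n_rows

print(f"mean length ~ {mean_length:.4f}")
```

The result agrees with the reported mean length of 10.4823, which supports reading the uppercase count as the row count.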

oldbalanceOrig

numerical

Approximate Distinct Count 1844245
Approximate Unique (%) 29.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 834795.684
Minimum 0
Maximum 5.9585e+07
Zeros 2101614
Zeros (%) 33.1%
Negatives 0
Negatives (%) 0.0%
  • oldbalanceOrig is skewed right (γ1 = 5.2438)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 14709.52
Q3 112239.1408
95-th Percentile 6.1874e+06
Maximum 5.9585e+07
Range 5.9585e+07
IQR 112239.1408

Descriptive Statistics

Mean 834795.684
Standard Deviation 2.89e+06
Variance 8.3519e+12
Sum 5.3019e+12
Skewness 5.2438
Kurtosis 32.8754
Coefficient of Variation 3.4619
  • oldbalanceOrig is not normally distributed (p-value 4.550751693966302e-25)
  • oldbalanceOrig has 1093905 outliers

newbalanceOrig

numerical

Approximate Distinct Count 2677400
Approximate Unique (%) 42.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 856169.5828
Minimum 0
Maximum 4.9585e+07
Zeros 3603682
Zeros (%) 56.7%
Negatives 0
Negatives (%) 0.0%
  • newbalanceOrig is skewed right (γ1 = 5.1724)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 0
Q3 151333.6825
95-th Percentile 6.3408e+06
Maximum 4.9585e+07
Range 4.9585e+07
IQR 151333.6825

Descriptive Statistics

Mean 856169.5828
Standard Deviation 2.9261e+06
Variance 8.5619e+12
Sum 5.4377e+12
Skewness 5.1724
Kurtosis 32.0038
Coefficient of Variation 3.4176
  • newbalanceOrig is not normally distributed (p-value 4.520986621860094e-25)
  • newbalanceOrig has 1028661 outliers

nameDest

categorical

Approximate Distinct Count 2716810
Approximate Unique (%) 42.8%
Missing 0
Missing (%) 0.0%
Memory Size 479399126 B

Length

Mean 10.4817
Standard Deviation 0.6048
Median 11
Minimum 2
Maximum 11

Sample

1st row M1979787155
2nd row M2044282225
3rd row C553264065
4th row C38997010
5th row M1230701703

Letter

Count 6351193
Lowercase Letter 0
Space Separator 0
Uppercase Letter 6351193
Dash Punctuation 0
Decimal Number 60220388

oldbalanceDest

numerical

Approximate Distinct Count 3609168
Approximate Unique (%) 56.8%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 1.101e+06
Minimum 0
Maximum 3.5602e+08
Zeros 2698585
Zeros (%) 42.5%
Negatives 0
Negatives (%) 0.0%
  • oldbalanceDest is skewed right (γ1 = 19.9342)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 139403.67
Q3 964184.4325
95-th Percentile 5.558e+06
Maximum 3.5602e+08
Range 3.5602e+08
IQR 964184.4325

Descriptive Statistics

Mean 1.101e+06
Standard Deviation 3.3989e+06
Variance 1.1553e+13
Sum 6.9929e+12
Skewness 19.9342
Kurtosis 950.0152
Coefficient of Variation 3.087
  • oldbalanceDest is not normally distributed (p-value 4.3723901720336425e-25)
  • oldbalanceDest has 768760 outliers

newbalanceDest

numerical

Approximate Distinct Count 3549046
Approximate Unique (%) 55.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 101619088 B
Mean 1.2254e+06
Minimum 0
Maximum 3.5618e+08
Zeros 2434670
Zeros (%) 38.3%
Negatives 0
Negatives (%) 0.0%
  • newbalanceDest is skewed right (γ1 = 19.3623)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 221924.89
Q3 1.1394e+06
95-th Percentile 5.8341e+06
Maximum 3.5618e+08
Range 3.5618e+08
IQR 1.1394e+06

Descriptive Statistics

Mean 1.2254e+06
Standard Deviation 3.6743e+06
Variance 1.35e+13
Sum 7.7826e+12
Skewness 19.3623
Kurtosis 863.0754
Coefficient of Variation 2.9985
  • newbalanceDest is not normally distributed (p-value 4.39830897580047e-25)
  • newbalanceDest has 719128 outliers

isFraud

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 419178738 B
  • The largest value (0) is over 822.01 times larger than the second largest value (1)

Length

Mean 1
Standard Deviation 0
Median 1
Minimum 1
Maximum 1

Sample

1st row 0
2nd row 0
3rd row 1
4th row 1
5th row 0

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 6351193
  • The top 2 categories (0, 1) take over 50.0%
  • isFraud has words of constant length
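The imbalance ratio reported for isFraud can be turned into approximate class sizes. A rough sketch, assuming a row count of 6351193 (the Decimal Number count above, with one digit per row) and treating the "over 822.01" figure as if it were the exact ratio, so the results are estimates rather than reported values:

```python
# Assumption: one digit per isFraud value, so digits == rows.
N_ROWS = 6_351_193

# Reported lower bound on (count of 0) / (count of 1).
ratio = 822.01

# With majority = ratio * minority and majority + minority = N_ROWS:
minority = N_ROWS / (1 + ratio)
majority = N_ROWS - minority
fraud_rate_pct = 100 * minority / N_ROWS

print(f"estimated isFraud == 1 rows: {minority:.0f}")
print(f"estimated isFraud == 0 rows: {majority:.0f}")
print(f"estimated fraud rate: {fraud_rate_pct:.2f}%")
```

An imbalance of this size usually means accuracy alone is a poor metric for any classifier trained on isFraud; precision, recall, or resampling strategies are common alternatives.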

Interactions

Correlations

Missing Values